Thomson Legal and Regulatory at NTCIR-5: Japanese and Korean Experiments
Abstract
Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-5 workshop. We submitted formal runs for monolingual retrieval in Japanese and Korean, as well as for bilingual English-to-Japanese retrieval. We employed enhanced tokenization for our Japanese and Korean runs and applied a novel selective pseudo-relevance feedback scheme for Japanese. For the bilingual runs, we simply used an off-the-shelf machine translation system to translate each English query into Japanese. Unfortunately, we cannot draw many conclusions from our participation, as our experiments were hampered by technical difficulties, particularly with our tokenization and stemming components.
Similar resources
Thomson Legal and Regulatory at NTCIR-4: Monolingual and Pivot-Language Retrieval Experiments
Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-4 workshop. We submitted formal runs for monolingual retrieval in Japanese, Chinese and Korean. Our bilingual runs from Chinese and Korean to Japanese rely on English as a pivot language. During our monolingual experiments, we compared building stopword lists using query logs to building stopword lists from collection stati...
Thomson Legal and Regulatory at NTCIR-3: Japanese, Chinese and English Retrieval Experiments
Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-3 workshop. We submitted formal runs for monolingual retrieval in Japanese and Chinese, and for bilingual retrieval from English to Japanese. Our main focus was on Japanese retrieval. We compared word-based and character-based indexing, as well as query formulation using characters and character bigrams. Our results show th...
POSTECH at NTCIR-5
This paper describes methodologies for NTCIR-5 CLIR involving Korean and Japanese, and reports the official results as well as retrieval results using NTCIR-3 and NTCIR-4 data. We participated in four tasks: K-K and J-J monolingual tracks and K-J and J-K cross-lingual tracks. Unlike in English, term extraction in Asian languages such as Korean and Japanese is nontrivial because of segmentation ambi...
CJK Experiments with Hummingbird SearchServer™ at NTCIR-5
Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval subtasks of the Cross-Lingual Information Retrieval Task of the 5th NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-5). For short Chinese (title) queries, a decompounded word-based approach produced higher (statistically significant) mean average precision and first relev...
NTCIR-5 CLIR Experiments at Oki
We participated in the SLIR, BLIR (PLIR) and MLIR subtasks of the NTCIR-5 CLIR task. Our IR system uses language models for document scoring and query expansion, and can handle four languages: Chinese, Japanese, Korean and English. The system utilizes multiple language resources (bilingual dictionaries, parallel corpora and machine translation systems). We attempted to use some techniques includ...